ABSTRACT



The logistic positive exponent family (LPEF) of models has 
been proposed by F. Samejima (1998) for dichotomous responses. This family of 
models is characterized by point-asymmetric item characteristic curves
(ICCs). This paper introduces the LPEF, and discusses its usefulness
in educational measurement and the implications of its use. Equations are 
given for the LPEF model. Two contrasting applications of the LPEF are 
discussed. One is an application in cognitive ability measurement for a 
situation in which the same task is assigned to two or more groups of 
individuals that differ in ability levels. With the LPEF, procedures of 
evaluation in each examination can be adjusted to suit each group. The other 
application is in personality or attitude measurement. Because models in the 
LPEF family are three-parameter models, it is advisable to use a 
nonparametric method to estimate the ICCs and then parameterize each of the 
resulting ICCs. (SLD) 
I. Objective

The logistic positive exponent family (LPEF) of models has been proposed by Samejima
(Psychometrika, 1998b) for dichotomous responses. While many mathematical models in
unidimensional item response theory are represented by a point-symmetric item characteristic
curve (ICC), or the conditional probability of the correct answer, given the latent trait $\theta$,
this family of models is characterized by point-asymmetric ICC’s. It should be noted that the
former group of models includes such popular models as the normal ogive model, the logistic
model, the Rasch model, the three-parameter logistic model, etc.

Although this family of models has been proposed, its implications and usefulness may
not be very obvious to researchers in educational measurement. The objective of the present
paper is to introduce the LPEF, and discuss its implications and usefulness in educational
measurement.



II. Theoretical Framework: Logistic Positive Exponent Family of Models

Let $\theta$ be the latent trait, or ability, which assumes any real number, and let $g$ denote an
item. The ICC of a model in the LPEF is given by

$$P_g(\theta) = \mathrm{prob.}[U_g = 1 \mid \theta] = [\Psi_g(\theta)]^{\xi_g}, \qquad \xi_g > 0, \tag{1}$$

where $U_g$ is a dichotomous item score which assumes either 0 (incorrect) or 1 (correct),
and

$$\Psi_g(\theta) = \frac{1}{1 + \exp[-D a_g (\theta - b_g)]}, \tag{2}$$

where $a_g$ is the discrimination parameter, $b_g$ is the difficulty parameter, and $D$ (= 1.702)
is the scaling factor. The third parameter, $\xi_g$, is called the acceleration parameter and
characterizes this family of models.
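As a minimal sketch of Eqs. (1) and (2), the ICC can be evaluated directly in Python. The function names `psi` and `lpef_icc` are hypothetical labels chosen here for illustration; they do not come from the source.

```python
import math

D = 1.702  # scaling factor from Eq. (2)


def psi(theta, a, b):
    """Logistic function of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def lpef_icc(theta, a, b, xi):
    """ICC of Eq. (1): the logistic function raised to the power xi (> 0)."""
    if xi <= 0:
        raise ValueError("the acceleration parameter xi must be positive")
    return psi(theta, a, b) ** xi


if __name__ == "__main__":
    # Probability of a correct answer at theta = b for several xi values
    for xi in (0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0):
        print(xi, round(lpef_icc(0.0, 1.0, 0.0, xi), 4))
```

With `xi = 1.0` the function returns the plain logistic ICC, so the logistic model is the special case of this sketch.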



seven examples whose ICC’s were presented in Figure 1, and are shown in Figure 2. Note that
when $\xi_g = 1$, that is, in the logistic model, the IIF becomes a symmetric, unimodal curve,
and, otherwise, those curves are unimodal but asymmetric, reflecting the fact that the ICC’s
are point-asymmetric when $\xi_g \neq 1$.
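The shape of the IIF can be explored with a short numerical sketch. Substituting Eq. (1) into Eq. (4) and using the derivative of the logistic function gives $P'_g(\theta) = \xi_g D a_g P_g(\theta)[1 - \Psi_g(\theta)]$; this closed form is worked out here rather than quoted from the source, and the function names are hypothetical.

```python
import math

D = 1.702  # scaling factor from Eq. (2)


def psi(theta, a, b):
    """Logistic function of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def lpef_iif(theta, a, b, xi):
    """Item information of Eq. (4) for an LPEF item.

    Intended for moderate theta: far in the upper tail 1 - P_g rounds
    to zero in floating point and the ratio is undefined.
    """
    s = psi(theta, a, b)
    p = s ** xi                        # P_g(theta), Eq. (1)
    dp = xi * D * a * p * (1.0 - s)    # first derivative of P_g(theta)
    return dp * dp / (p * (1.0 - p))


if __name__ == "__main__":
    grid = (-2.0, -1.0, 0.0, 1.0, 2.0)
    for xi in (0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0):
        print(xi, [round(lpef_iif(t, 1.0, 0.0, xi), 4) for t in grid])
```

For `xi = 1.0` the values are mirror images around `theta = b`; for any other `xi` they are not, which is the asymmetry described above.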

III. Implications of the LPEF Models 

It is a common practice that researchers adopt a model that provides point-symmetric
ICC’s, which, for brevity, shall be called symmetric ICC’s. One characteristic of a symmetric 
ICC is that it treats both correct and incorrect answers symmetrically. This leads to a logical 
contradiction in ordering examinees on the latent trait or ability scale. 

Consider the maximum likelihood estimate (MLE) of the latent trait. For the purpose
of illustration, Table 1 presents the 32 possible response patterns of five dichotomous items
following the normal ogive model, arranged in the ascending order of the MLE’s of
the latent trait. These hypothetical items have a common discrimination parameter, $a_g = 1.0$,
and separate, equally spaced difficulty parameters, $b_g = -3.0, -1.5, 0.0, 1.5, 3.0$, respectively.
It can be seen by dividing the 32 response patterns into two subgroups, that is, rows 1
through 16 and rows 17 through 32, that the response patterns of the second
group are complements of those of the first group arranged in the reversed order.



Insert Table 1 About Here 



It is logical to expect that the orders of MLE’s are consistent for any pair of subsets of
responses. Table 1 indicates, however, that this consistency in rank order does not exist in the
normal ogive model. If, for example, the response pattern with a subset 101 for items 2,
3 and 4 is ranked higher than the response pattern with another subset 110 for items 2, 3
It is obvious from Eqs. (1) and (2) that, when $\xi_g = 1$, the ICC in the LPEF becomes that
of the logistic model. In this specific case, the ICC is represented by a point-symmetric curve,
that is, the ICC has its point of symmetry at $(b_g, 0.5)$ and the relationship

$$P_g(\theta^{+}) = 1 - P_g(\theta^{-}) \tag{3}$$

holds with any real number $d$, where

$$\theta^{+} = b_g + d, \qquad \theta^{-} = b_g - d.$$

It should be noted that most mathematical models that have been widely used, such as the
normal ogive model, the logistic model, the Rasch model, the 3-parameter logistic model, etc., provide
point-symmetric ICC’s. A strength of the LPEF is that its models provide point-asymmetric
curves when $\xi_g \neq 1$, which do not satisfy Eq. (3), and enable researchers to order individuals on the
latent trait dimension with a consistent philosophy.
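The point-symmetry condition of Eq. (3) is easy to check numerically. The sketch below evaluates the gap between the two sides of Eq. (3) at several distances $d$ from $b_g$; `symmetry_gap` is a hypothetical helper name introduced here, not part of the source.

```python
import math

D = 1.702  # scaling factor from Eq. (2)


def psi(theta, a, b):
    """Logistic function of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def lpef_icc(theta, a, b, xi):
    """ICC of Eq. (1)."""
    return psi(theta, a, b) ** xi


def symmetry_gap(d, a, b, xi):
    """P_g(b + d) - [1 - P_g(b - d)]; zero for every d when Eq. (3) holds."""
    return lpef_icc(b + d, a, b, xi) - (1.0 - lpef_icc(b - d, a, b, xi))


if __name__ == "__main__":
    for xi in (0.5, 1.0, 2.0):
        print(xi, [round(symmetry_gap(d, 1.0, 0.0, xi), 4) for d in (0.5, 1.0, 2.0)])
```

The gap vanishes only for `xi = 1.0`; it is positive for the smaller exponent and negative for the larger one in this sketch.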



Insert Figures 1 and 2 About Here 



Figure 1 represents the ICC’s of 7 examples in the LPEF given by Eqs. (1) and (2), with
the common discrimination and difficulty parameters $a_g = 1$ and $b_g = 0$, and the separate
acceleration parameters $\xi_g = 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0$, respectively. The item information
function (IIF) is given by

$$I_g(\theta) = \frac{[P'_g(\theta)]^2}{P_g(\theta)\,[1 - P_g(\theta)]} \tag{4}$$

for dichotomous response models in general, where $P'_g(\theta)$ indicates the first derivative of $P_g(\theta)$
with respect to $\theta$. Substituting Eqs. (1) and (2) into Eq. (4), the IIF’s were obtained for the
and 4 in one environment, it is expected that the same rank order should exist in any other
environment. Table 1 shows that, while the above rank order holds for the response patterns
01010 (#10) and 01100 (#6) and also for 01011 (#21) and 01101 (#19), the reversal of
the rank order occurs for 11010 (#24) and 11100 (#25), and also for 11011 (#29) and
11101 (#30).

The same contradiction can be observed from another angle. It is noted in Table 1 that 

1. The five response patterns, each of which contains only one correct response, are
arranged in the order of difficulty of the item that is answered correctly, and

2. The five response patterns, each of which contains four correct responses, are arranged
in the order of difficulty of the item that is not answered correctly.

These two principles contradict each other. If the first principle is accepted, then we
should expect that, out of the five response patterns that have four correct answers each, the
response pattern with the four most difficult items answered correctly will receive the highest
ability estimate. However, if the second principle is true, then we should expect that, out of
the five response patterns that have only one correct answer each, the response pattern with
the easiest item answered correctly will receive the highest rank.

The above are just two examples, but the reversal of the two principles in assigning MLE’s is
seen in other response patterns also. These contradictions are intrinsic in all symmetric ICC’s,
with the exception of the logistic model, in which the MLE is not affected by the difficulty
parameters, $b_g$ for $g = 1, 2, \ldots, n$ (see Table 1). The contradiction in the rank order of
response patterns does not exist, however, in models of the LPEF, which provide asymmetric
ICC’s except for $\xi_g = 1$.

It is noted in Figure 1 that when $\xi_g < 1$ the ICC assumes higher values than the logistic
ICC for the entire range of $\theta$, and the enhancement becomes larger as $\xi_g$ gets smaller. Since Eq. (1)
can be written as

$$P_g(\theta) = [\Psi_g(\theta)]^{\xi_g} = \Psi_g(\theta) + \left[\{\Psi_g(\theta)\}^{\xi_g - 1} - 1\right] \Psi_g(\theta), \qquad 0 < \xi_g < 1, \tag{5}$$

$[\{\Psi_g(\theta)\}^{\xi_g - 1} - 1]$ (> 0) can be considered as the conditional elevation ratio, given $\theta$, which
is strictly decreasing in $\theta$ and also strictly decreasing in $\xi_g$. In other words, if an item
has an ICC given by Eq. (1) with a very small positive $\xi_g$, then even individuals on very low
ability levels have substantially high probabilities of passing the item. Thus it is natural
to expect that, when a test consists of items with common $a_g$ and $\xi_g$ (< 1) and different
$b_g$’s, the principle of penalizing failure in solving easier items should be consistently followed. This
is confirmed by the examples illustrated in Table 2, in which $\xi_g = 0.3, 0.5, 0.8$.



Insert Table 2 About Here 



It is a logical consequence that, for the same response pattern, the values of MLE are
different, depending on the values of the $\xi_g$’s; for a smaller $\xi_g$ the value of MLE is lower. This
is well illustrated in Table 2. For example, for the response pattern 10111 the values of MLE
are -0.81381, 0.76848 and 1.89136 for $\xi_g = 0.3, 0.5, 0.8$, respectively. Note that all these
values of MLE are less than 2.28753, the value of MLE when $\xi_g = 1.0$, i.e., in the logistic
model (see Table 1).
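The MLE’s discussed here can be approximated with a short numerical sketch. It assumes local independence of the item responses, the five items of Table 1 (common $a_g = 1.0$, $b_g = -3.0, -1.5, 0.0, 1.5, 3.0$) and a common $\xi_g$, and it finds the root of the likelihood equation by bisection on the interval [-10, 10]. Bisection, the search interval and the function names are choices made for this illustration; the source does not say how its estimates were computed.

```python
import math

D = 1.702  # scaling factor from Eq. (2)
ITEM_B = (-3.0, -1.5, 0.0, 1.5, 3.0)  # difficulty parameters of the five items


def psi(theta, a, b):
    """Logistic function of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def score(theta, pattern, a, bs, xi):
    """First derivative of the log-likelihood of a response pattern."""
    total = 0.0
    for u, b in zip(pattern, bs):
        s = psi(theta, a, b)
        p = s ** xi
        # P'_g / [P_g (1 - P_g)] reduces to xi * D * a * (1 - s) / (1 - p)
        total += (u - p) * xi * D * a * (1.0 - s) / (1.0 - p)
    return total


def lpef_mle(pattern, a, bs, xi, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection on the score; assumes one sign change inside [lo, hi]."""
    if len(set(pattern)) < 2:
        raise ValueError("no finite MLE for all-0 or all-1 patterns")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, pattern, a, bs, xi) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for xi in (0.3, 0.5, 0.8, 1.0):
        print(xi, round(lpef_mle((1, 0, 1, 1, 1), 1.0, ITEM_B, xi), 5))
```

If the parameterization above matches the one behind Tables 1 and 2, the printed estimates for pattern 10111 should fall close to the values quoted in the text.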

When $\xi_g > 1$, the ICC’s assume lower values than the logistic ICC for all $\theta$, as is
illustrated in Figure 1 for $\xi_g = 1.5, 2.0, 3.0$. Since Eq. (1) can be rewritten in the form

$$P_g(\theta) = [\Psi_g(\theta)]^{\xi_g} = \Psi_g(\theta) - \left[1 - \{\Psi_g(\theta)\}^{\xi_g - 1}\right] \Psi_g(\theta), \qquad \xi_g > 1, \tag{6}$$

$[1 - \{\Psi_g(\theta)\}^{\xi_g - 1}]$ (> 0) can be considered as the conditional drop ratio, which is strictly
decreasing in $\theta$ and strictly increasing in $\xi_g$. In other words, if an item has an ICC given
by Eq. (1) with a large positive $\xi_g$, then even individuals with very high ability levels have a
substantially low probability of passing the item. Thus it is reasonable to expect that, when
a test consists of items with common $a_g$ and $\xi_g$ (> 1) and different $b_g$’s, the philosophy
of giving credit for success in solving more difficult items should consistently hold. This
principle is confirmed by the examples in Table 3, in which $\xi_g = 1.5, 2.0, 3.0$.



Insert Table 3 About Here 



As is the case with $\xi_g < 1$, it is a logical consequence that, for the same response pattern,
the values of MLE are different depending on the values of $\xi_g$. Again, for a smaller $\xi_g$ the
value of MLE is lower, as illustrated in Table 3. For example, for the same response pattern
10111 that was illustrated earlier, the values of MLE are 2.84408, 3.14744 and 3.50199 for
$\xi_g = 1.5, 2.0, 3.0$, respectively. Note that all these values of MLE are higher than the three
counterparts for $\xi_g = 0.3, 0.5, 0.8$ and also the value of MLE in the logistic model.

The logistic model that is obtained by setting $\xi_g = 1$ in Eq. (1) can be interpreted, therefore,
as a transition between the two opposing principles in the LPEF, and in this specific case both
principles degenerate. Thus item difficulties will not affect the order of MLE’s.
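One way to see why the logistic model is the transition case is to write out the likelihood equation. The following is a sketch worked out from Eqs. (1) and (2) under the assumption of local independence, not a derivation quoted from the source; $L$ and $h_g$ are labels introduced here for illustration.

```latex
% Log-likelihood of a response pattern (u_1, ..., u_n), assuming local independence
\log L(\theta) = \sum_{g=1}^{n} \left\{ u_g \log P_g(\theta) + (1 - u_g) \log [1 - P_g(\theta)] \right\}

% From Eqs. (1) and (2): P'_g(\theta) = \xi_g D a_g P_g(\theta) [1 - \Psi_g(\theta)]
\frac{\partial \log L(\theta)}{\partial \theta}
  = \sum_{g=1}^{n} \xi_g D a_g \, [1 - \Psi_g(\theta)] \, \frac{u_g - P_g(\theta)}{1 - P_g(\theta)}

% Gain in this derivative from answering item g correctly rather than incorrectly
h_g(\theta) = \xi_g D a_g \, \frac{1 - \Psi_g(\theta)}{1 - [\Psi_g(\theta)]^{\xi_g}}
```

With $\xi_g = 1$ the ratio equals 1 and $h_g(\theta) = D a_g$, which does not involve $b_g$, so with a common $a_g$ the MLE depends only on the number of correct answers. For $\xi_g > 1$ the ratio decreases as $\Psi_g(\theta)$ increases, so a correct answer to a more difficult item contributes more; for $0 < \xi_g < 1$ the direction is reversed, so a failure on an easier item costs more.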

IV. Usefulness of the LPEF Models 

One concern may be in what cases the models in the LPEF should appropriately be adopted. In
this section, two contrasting applications of the LPEF will be given and discussed. It is hoped
that readers will use them as hints, expand their imaginations, use analogies, etc., in order to
find a use for LPEF models in their own research.

[IV.1] An Application in Cognitive Ability Measurement

Suppose there are two training programs for a certain computer language. In each program,
the trainees’ progress is evaluated by having them write actual computer programs of the
same set of contents, using the language they have learned. In one training program, these
exams are given with the instructor’s simple and straightforward explanations of the content
of the target computer program, and the trainees are supposed to write a computer program
on their own. When each trainee decides that his/her program should run correctly, it will be
handed in. In the other training program, the trainees are allowed to run the programs they
have written with data to find out if they actually run, and if a program does not run well
the trainee can trouble-shoot and modify it up to, say, five times, and then the printout after the
fifth revision should be handed in.

Because the content of each computer program has its own difficulty level, it will be represented
by its difficulty parameter. Since the evaluation procedures in the two training programs
are substantially different for the same contents of exams, the values of the acceleration parameter
should be expected to be different for the two training programs.

It should be noted that, in the first training program, the trainees must take all factors into
consideration and produce a correct computer program in the first trial without any feedback
information. Thus only trainees who have very high programming ability have a high probability
of passing the exam, and passing the exam deserves high credit. Thus an LPEF model with
$\xi_g > 1$ will fit. In the second training program, since the trainees are allowed to make
mistakes, trouble-shoot and make revisions up to five times, even those on relatively lower
levels of ability will have a high probability of passing the exam. Thus penalization of the failure
in writing a useable program should be emphasized. An LPEF model with $0 < \xi_g < 1$ will be
suitable in such a case.

The usefulness of LPEF models is pronounced in this example in the sense that, when the same
task is assigned to two or more groups of individuals that differ in ability levels, procedures of
evaluation in each exam can be adjusted to suit each group. These different instructions will
affect the parameter $\xi_g$ for each group of individuals. Note that the same response pattern
for the same set of items will not provide the same MLE for the two or more training programs,
as was observed earlier, and yet the estimated ability levels of individuals in these separate
programs can still be located on the same ability dimension.

For example, if there are five tests in the training programs and the acceleration parameter
assumes 2.0 in the first program and 0.5 in the second, the MLE will be 0.77745 in the
first program for the pass-fail pattern of 00110, while it will be -2.59861 in the second
program for the same pass-fail pattern (see Tables 2 and 3). For the seven different values of
the acceleration parameter, $\xi_g = 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0$, that were cited earlier, the values
of MLE for this specific pass-fail pattern are -3.62818, -2.59861, -1.39938, -0.75260,
0.24694, 0.77745 and 1.33889, respectively. It is noted that the MLE increases with $\xi_g$,
and this relationship holds for any pass-fail pattern.

There is a possibility that this relationship between the pass-fail pattern and the MLE
gives an unqualified disadvantage to a bright individual. Suppose that a bright individual is
misclassified into the second training program, and this person’s pass-fail pattern turns out
to be 11110. If $\xi_g = 0.5$ in the second program, as was exemplified earlier, then his/her MLE
will be 1.76665. Suppose, further, that for items 1 through 4 this subject actually completed
the computer programs without even running data to confirm that the programs were right.
In such a case this individual would have got the same pass-fail pattern, 11110, had he/she
been put into the first training program where $\xi_g = 2.0$; and yet he/she will get the unfairly low
value of 1.76665 as his/her estimated ability level, instead of 2.76207.

A solution for this problem will be the use of graded scores. For example, scores can be
given in such a way that those who wrote a useable computer program:

1. on their own get score 6,

2. after one set of running data and trouble-shooting get score 5,

3. after two sets of the above process get score 4,

4. after three sets of the above process get score 3,

5. after four sets of the above process get score 2,

6. after five sets of the above process get score 1, and

7. those who failed in writing a useable computer program even after five sets of the
above process get score 0.

Thus an LPEF model on the graded response level (Samejima, 1997) will be applied. In
this way the possibility of unqualified disadvantage for bright individuals will disappear, and
there is no need to use two separate training programs either. A trade-off is that the evaluation
process will become more complicated, and stricter supervision by the tester will be needed.

[IV.2] An Application in Personality or Attitude Measurement 

It is desirable in any personality or attitude measurement that our inventory or questionnaire
should measure a wide range of the latent trait accurately, whether it is a specific
personality scale or an attitude scale toward a specific topic. This accuracy of measurement
can be evaluated locally for each scale, or as a function of the latent trait $\theta$. This is done by
using the inverse of the square root of the test information function, $I(\theta)$, as the local
standard error of estimation, where $I(\theta)$ is given by

$$I(\theta) = \sum_{g=1}^{n} I_g(\theta), \tag{7}$$

and $I_g(\theta)$ is the item information function provided by Eq. (4).

For the purpose of illustration, Figure 3 presents the test information function for each of
seven hypothetical tests (or inventories or questionnaires). Each test consists of thirteen
dichotomous items, with a common discrimination parameter $a_g = 1$, and the difficulty
parameter $b_g$ varies from -3 to +3 with the interval width of 0.5. The acceleration parameter
$\xi_g$ varies across the separate tests, taking the values 0.3, 0.5, 0.8, 1.0, 1.5, 2.0 and 3.0, respectively.



Insert Figures 3 and 4 About Here 



It is obvious from Figure 3 that, except for the lower range of $\theta$, the amount of information
becomes larger when the acceleration parameter is higher. Actually these discrepancies are
a little exaggerated, for it is not $I(\theta)$ but its square root that is counted. Figure 4 presents
$\sqrt{I(\theta)}$ of the same seven hypothetical tests.

The local standard error of estimation, $[I(\theta)]^{-1/2}$, for each of the same seven hypothetical tests
is presented as Figure 5. This figure is informative. For example, approximating the conditional
distribution of MLE, given $\theta$, by the normal distribution with the mean $\theta$ and the standard
deviation $[I(\theta)]^{-1/2}$, the 68 percent confidence interval at $\theta = 2.0$ is (1.53, 2.47) when
$\xi_g = 3.0$, while it is (1.11, 2.89) when $\xi_g = 0.3$, indicating that in the latter case estimation of $\theta$ is less
accurate than in the former. The relative widths of the confidence intervals are reversed at,
say, $\theta = -3.5$, where they are (-5.02, -1.98) and (-4.41, -2.59), respectively.
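The local standard error can be sketched numerically from Eqs. (4) and (7). The code assumes the thirteen-item design described for Figure 3 (common $a_g = 1$, $b_g$ from -3.0 to 3.0 in steps of 0.5, one common $\xi_g$ per test) and uses the closed-form derivative $P'_g(\theta) = \xi_g D a_g P_g(\theta)[1 - \Psi_g(\theta)]$, which is worked out here rather than quoted from the source. All function names are hypothetical.

```python
import math

D = 1.702  # scaling factor from Eq. (2)
BS_13 = [-3.0 + 0.5 * k for k in range(13)]  # -3.0, -2.5, ..., 3.0


def psi(theta, a, b):
    """Logistic function of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def lpef_iif(theta, a, b, xi):
    """Item information of Eq. (4) for an LPEF item."""
    s = psi(theta, a, b)
    p = s ** xi
    dp = xi * D * a * p * (1.0 - s)
    return dp * dp / (p * (1.0 - p))


def total_information(theta, a, bs, xi):
    """Test information of Eq. (7): the sum of the item information values."""
    return sum(lpef_iif(theta, a, b, xi) for b in bs)


def local_se(theta, a, bs, xi):
    """Local standard error: inverse square root of the test information."""
    return 1.0 / math.sqrt(total_information(theta, a, bs, xi))


def interval_68(theta, a, bs, xi):
    """Normal-approximation 68 percent interval: theta plus/minus one SE."""
    se = local_se(theta, a, bs, xi)
    return (theta - se, theta + se)


if __name__ == "__main__":
    for xi in (0.3, 3.0):
        for theta in (2.0, -3.5):
            lo, hi = interval_68(theta, 1.0, BS_13, xi)
            print(xi, theta, round(lo, 2), round(hi, 2))
```

Under these assumptions the intervals printed for $\theta = 2.0$ and $\theta = -3.5$ can be compared directly with the ones quoted above.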



Insert Figure 5 About Here 



The observations made above indicate that, in order to measure the latent trait reasonably
accurately for a wide range of $\theta$, it will be desirable to mix items with a variety of
different values of $\xi_g$. To realize this, we must look into the items to see if there is a possibility
of adjusting the value of $\xi_g$.

Take the Minnesota Multiphasic Personality Inventory (MMPI) as an example. The MMPI
basically consists of ten personality scales, such as depression, schizophrenia, social introversion,
etc., and each scale has its own set of statements or items. Each statement is written as a
first-person singular sentence, and the examinee is expected to answer each statement either “true”
or “false” (with an additional category of “cannot say”). Consider the following four example
statements (Rogers, T. B., 1995):



1. I am concerned about sex matters. 



2. Some of my family have habits that bother me very much. 

3. It takes a lot of argument to convince most people that they are wrong. 

4. I wish I were not as shy as I am. 

It may be reasoned that if we change item 3 to the sentence:

(3a) It takes some argument to convince most people that they are wrong,

the ICC will be changed also, and most likely the value of $\xi_g$ becomes less, inviting more
individuals on lower levels of $\theta$ to answer “true.” This will also be the case with item 2, and
if it is changed to:

(2a) Some of my family have habits that bother me,

the value of $\xi_g$ will be shifted in the same direction. On the other hand, if item 1 is changed
to:

(1a) I am concerned about sex matters very much,

then the value of $\xi_g$ will become higher. These predictions will be confirmed or disconfirmed
by estimating the ICC’s of both the original and revised items in appropriate pilot studies,
and comparing the two resultant estimates of the ICC. It can be seen that such modifications are
possible with many items in personality or attitude measurement. In contrast, item 4 may not
have as much room for modification as the other three items do. It should be expected, therefore, that
modifications are not possible for all items. If a large number of statements have room for
modification, then it will be possible to modify or develop an inventory that has a sufficiently
small and practically constant standard error of estimation over a wide range of $\theta$.

V. Conclusions and Scientific Importance 



Models in the LPEF are three-parameter models, so it is advisable to use a nonparametric
method (e.g., Samejima, 1998a) for estimating ICC’s, and then parameterize each of the resulting
ICC’s. This procedure will ameliorate the indeterminacy of the parameter estimates that
is unavoidable when the model contains more than two parameters.

There is a gap between psychometricians who actively propose new mathematical models
and researchers who apply mathematical models in educational measurement, and thus valid
mathematical models are often overlooked by the latter group of researchers. Since mathematical
models are useless unless they are validly used in empirical research, including educational
measurement, explanations of the nature, implications and usefulness of a specific model will
be important. The proposed paper is believed to have scientific importance in this regard.
TABLE 1



MLE’s of $\theta$ Based on 32 Response Patterns of 5 Dichotomous Items Following the
Normal Ogive Model and the Logistic Model with the Item Parameters $a_g = 1.0$ for
All Items and $b_g = -3.0, -1.5, 0.0, 1.5, 3.0$, Respectively, Arranged in the Ascending
Order of Those in the Normal Ogive Model.
FIGURE 2

Item information functions of the seven items whose item characteristic curves are
shown in Figure 1.
FIGURE 3

Test information functions of seven hypothetical tests of 13 items each, following LPEF
models with $\xi_g = 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0$, respectively, with the common
discrimination parameter $a_g = 1$ and the common set of 13 difficulty parameters,
$b_g = -3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0$.
FIGURE 4

Square roots of the test information functions of the seven hypothetical tests of 13
items each, following LPEF models whose item parameters are the same as those
described in Figure 3.